Overview

Dataset statistics

Number of variables: 14
Number of observations: 4375
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): < 0.1%
Total size in memory: 478.6 KiB
Average record size in memory: 112.0 B

Variable types

NUM: 9
BOOL: 4
CAT: 1

Warnings

Dataset has 2 (< 0.1%) duplicate rows [Duplicates]
Seniority has 520 (11.9%) zeros [Zeros]
Income has 336 (7.7%) zeros [Zeros]
Assets has 1619 (37.0%) zeros [Zeros]
Debt has 3610 (82.5%) zeros [Zeros]
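
The zero percentages in these warnings follow directly from the reported counts and the 4375 observations. The sketch below recomputes them in plain Python; the helper name `zero_share` is hypothetical and is not part of pandas-profiling.

```python
# Zero counts copied from the Warnings list; n_rows is the reported
# number of observations. zero_share is a hypothetical helper name.
n_rows = 4375
zero_counts = {"Seniority": 520, "Income": 336, "Assets": 1619, "Debt": 3610}


def zero_share(count, total):
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: zero_share(count, n_rows) for name, count in zero_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```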

Reproduction

Analysis started: 2020-09-27 14:11:42.588150
Analysis finished: 2020-09-27 14:12:14.726736
Duration: 32.14 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

Seniority
Real number (ℝ≥0)

ZEROS

Distinct: 47
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.985142857
Minimum: 0
Maximum: 48
Zeros: 520
Zeros (%): 11.9%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 12
95-th percentile: 25
Maximum: 48
Range: 48
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.17392156
Coefficient of variation (CV): 1.023641243
Kurtosis: 1.847950231
Mean: 7.985142857
Median Absolute Deviation (MAD): 4
Skewness: 1.388399248
Sum: 34935
Variance: 66.81299366
Monotonicity: Not monotonic
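
Several of these figures are linked by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1). The sketch below checks the reported Seniority values against those identities, up to rounding in the printed digits; the tolerance is an arbitrary choice for illustration.

```python
from math import isclose

# Values copied from the Seniority summary in this report.
n_obs = 4375
total = 34935
mean = 7.985142857
std = 8.17392156
variance = 66.81299366
cv = 1.023641243
q1, q3 = 2, 12
minimum, maximum = 0, 48

# Each entry is True when the reported figures agree with the identity.
checks = {
    "mean = sum / n": isclose(total / n_obs, mean, rel_tol=1e-6),
    "variance = std ** 2": isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": isclose(std / mean, cv, rel_tol=1e-6),
    "IQR = Q3 - Q1": q3 - q1 == 10,
    "range = max - min": maximum - minimum == 48,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```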
Histogram with fixed size bins (bins=47)
Value                Count  Frequency (%)
0                      520  11.9%
1                      504  11.5%
2                      448  10.2%
3                      328  7.5%
5                      261  6.0%
10                     230  5.3%
4                      230  5.3%
6                      179  4.1%
8                      161  3.7%
15                     154  3.5%
Other values (37)     1360  31.1%

Value                Count  Frequency (%)
0                      520  11.9%
1                      504  11.5%
2                      448  10.2%
3                      328  7.5%
4                      230  5.3%

Value                Count  Frequency (%)
48                       1  < 0.1%
47                       1  < 0.1%
45                       3  0.1%
43                       2  < 0.1%
42                       1  < 0.1%

Home
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Value                Count  Frequency (%)
1                     2311  52.8%
0                     2064  47.2%

Time
Real number (ℝ≥0)

Distinct: 11
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.51885714
Minimum: 6
Maximum: 72
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 6
5-th percentile: 18
Q1: 36
Median: 48
Q3: 60
95-th percentile: 60
Maximum: 72
Range: 66
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.61928212
Coefficient of variation (CV): 0.3142657197
Kurtosis: -0.4207283847
Mean: 46.51885714
Median Absolute Deviation (MAD): 12
Skewness: -0.7752061897
Sum: 203520
Variance: 213.7234098
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value                Count  Frequency (%)
60                    1907  43.6%
36                     926  21.2%
48                     843  19.3%
24                     338  7.7%
12                     146  3.3%
18                      89  2.0%
30                      48  1.1%
6                       31  0.7%
42                      29  0.7%
54                      17  0.4%

Value                Count  Frequency (%)
6                       31  0.7%
12                     146  3.3%
18                      89  2.0%
24                     338  7.7%
30                      48  1.1%

Value                Count  Frequency (%)
72                       1  < 0.1%
60                    1907  43.6%
54                      17  0.4%
48                     843  19.3%
42                      29  0.7%

Age
Real number (ℝ≥0)

Distinct: 50
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.02171429
Minimum: 18
Maximum: 68
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 18
5-th percentile: 22
Q1: 28
Median: 36
Q3: 45
95-th percentile: 57
Maximum: 68
Range: 50
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 10.97318652
Coefficient of variation (CV): 0.2963986604
Kurtosis: -0.6069935264
Mean: 37.02171429
Median Absolute Deviation (MAD): 8
Skewness: 0.4922603619
Sum: 161970
Variance: 120.4108224
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
28                     176  4.0%
26                     162  3.7%
32                     157  3.6%
30                     156  3.6%
34                     154  3.5%
24                     152  3.5%
27                     152  3.5%
31                     149  3.4%
36                     143  3.3%
33                     141  3.2%
Other values (40)     2833  64.8%

Value                Count  Frequency (%)
18                       8  0.2%
19                      27  0.6%
20                      47  1.1%
21                      81  1.9%
22                     111  2.5%

Value                Count  Frequency (%)
68                       2  < 0.1%
66                       9  0.2%
65                      12  0.3%
64                      13  0.3%
63                      22  0.5%

Marital
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Value                Count  Frequency (%)
0                     3189  72.9%
1                     1186  27.1%

Records
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Value                Count  Frequency (%)
1                     3622  82.8%
2                      753  17.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Job
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Value                Count  Frequency (%)
0                     2782  63.6%
1                     1593  36.4%

Expenses
Real number (ℝ≥0)

Distinct: 93
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.5952
Minimum: 35
Maximum: 173
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 35
5-th percentile: 35
Q1: 35
Median: 51
Q3: 72
95-th percentile: 90
Maximum: 173
Range: 138
Interquartile range (IQR): 37

Descriptive statistics

Standard deviation: 19.45156894
Coefficient of variation (CV): 0.3498785676
Kurtosis: 1.087861206
Mean: 55.5952
Median Absolute Deviation (MAD): 16
Skewness: 0.9665142304
Sum: 243229
Variance: 378.3635343
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
35                    1188  27.2%
45                     806  18.4%
60                     772  17.6%
75                     559  12.8%
90                     190  4.3%
105                     40  0.9%
47                      28  0.6%
54                      27  0.6%
46                      25  0.6%
42                      23  0.5%
Other values (83)      717  16.4%

Value                Count  Frequency (%)
35                    1188  27.2%
37                       1  < 0.1%
38                       2  < 0.1%
39                       4  0.1%
40                       4  0.1%

Value                Count  Frequency (%)
173                      1  < 0.1%
168                      1  < 0.1%
165                      1  < 0.1%
153                      1  < 0.1%
150                      1  < 0.1%

Income
Real number (ℝ≥0)

ZEROS

Distinct: 350
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 130.8489143
Minimum: 0
Maximum: 959
Zeros: 336
Zeros (%): 7.7%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 80
Median: 120
Q3: 165
95-th percentile: 280
Maximum: 959
Range: 959
Interquartile range (IQR): 85

Descriptive statistics

Standard deviation: 86.19951805
Coefficient of variation (CV): 0.6587713663
Kurtosis: 10.22090803
Mean: 130.8489143
Median Absolute Deviation (MAD): 40
Skewness: 2.01985208
Sum: 572464
Variance: 7430.356912
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                      336  7.7%
100                    152  3.5%
150                    113  2.6%
120                    107  2.4%
80                      87  2.0%
110                     85  1.9%
90                      83  1.9%
200                     76  1.7%
140                     75  1.7%
125                     74  1.7%
Other values (340)    3187  72.8%

Value                Count  Frequency (%)
0                      336  7.7%
6                        1  < 0.1%
8                        1  < 0.1%
16                       1  < 0.1%
17                       1  < 0.1%

Value                Count  Frequency (%)
959                      1  < 0.1%
905                      1  < 0.1%
857                      1  < 0.1%
830                      1  < 0.1%
800                      1  < 0.1%

Assets
Real number (ℝ≥0)

ZEROS

Distinct: 158
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5298.431314
Minimum: 0
Maximum: 250000
Zeros: 1619
Zeros (%): 37.0%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3000
Q3: 6000
95-th percentile: 18000
Maximum: 250000
Range: 250000
Interquartile range (IQR): 6000

Descriptive statistics

Standard deviation: 10582.37737
Coefficient of variation (CV): 1.997266122
Kurtosis: 122.1639461
Mean: 5298.431314
Median Absolute Deviation (MAD): 3000
Skewness: 8.359312289
Sum: 23180637
Variance: 111986710.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                     1619  37.0%
4000                   342  7.8%
5000                   281  6.4%
3000                   270  6.2%
3500                   192  4.4%
6000                   174  4.0%
8000                   131  3.0%
10000                  118  2.7%
7000                   115  2.6%
2500                   110  2.5%
Other values (148)    1023  23.4%

Value                Count  Frequency (%)
0                     1619  37.0%
18                       1  < 0.1%
300                      1  < 0.1%
450                      1  < 0.1%
500                      6  0.1%

Value                Count  Frequency (%)
250000                   1  < 0.1%
200000                   1  < 0.1%
150000                   1  < 0.1%
120000                   1  < 0.1%
110000                   1  < 0.1%

Debt
Real number (ℝ≥0)

ZEROS

Distinct: 181
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 342.5515429
Minimum: 0
Maximum: 30000
Zeros: 3610
Zeros (%): 82.5%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2500
Maximum: 30000
Range: 30000
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1217.6228
Coefficient of variation (CV): 3.554568138
Kurtosis: 150.7841378
Mean: 342.5515429
Median Absolute Deviation (MAD): 0
Skewness: 9.199789288
Sum: 1498663
Variance: 1482605.283
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                     3610  82.5%
2000                    59  1.3%
1500                    51  1.2%
3000                    45  1.0%
1000                    39  0.9%
2500                    26  0.6%
500                     25  0.6%
400                     20  0.5%
4000                    20  0.5%
200                     20  0.5%
Other values (171)     460  10.5%

Value                Count  Frequency (%)
0                     3610  82.5%
1                        2  < 0.1%
10                       1  < 0.1%
12                       1  < 0.1%
25                       1  < 0.1%

Value                Count  Frequency (%)
30000                    1  < 0.1%
23500                    1  < 0.1%
21400                    1  < 0.1%
15500                    1  < 0.1%
15000                    1  < 0.1%

Amount
Real number (ℝ≥0)

Distinct: 282
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1037.463771
Minimum: 100
Maximum: 4500
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 100
5-th percentile: 350
Q1: 700
Median: 1000
Q3: 1300
95-th percentile: 1800
Maximum: 4500
Range: 4400
Interquartile range (IQR): 600

Descriptive statistics

Standard deviation: 469.7535311
Coefficient of variation (CV): 0.4527902988
Kurtosis: 3.51701542
Mean: 1037.463771
Median Absolute Deviation (MAD): 300
Skewness: 1.022272339
Sum: 4538904
Variance: 220668.38
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1000                   534  12.2%
800                    217  5.0%
1200                   215  4.9%
1100                   209  4.8%
900                    196  4.5%
1300                   194  4.4%
1500                   193  4.4%
600                    184  4.2%
500                    180  4.1%
700                    168  3.8%
Other values (272)    2085  47.7%

Value                Count  Frequency (%)
100                      5  0.1%
105                      1  < 0.1%
107                      1  < 0.1%
120                      1  < 0.1%
125                      3  0.1%

Value                Count  Frequency (%)
4500                     1  < 0.1%
4000                     3  0.1%
3900                     1  < 0.1%
3875                     1  < 0.1%
3800                     2  < 0.1%

Price
Real number (ℝ≥0)

Distinct: 1403
Distinct (%): 32.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1459.732343
Minimum: 105
Maximum: 11140
Zeros: 0
Zeros (%): 0.0%
Memory size: 34.2 KiB

Quantile statistics

Minimum: 105
5-th percentile: 600
Q1: 1116.5
Median: 1400
Q3: 1688
95-th percentile: 2432.3
Maximum: 11140
Range: 11035
Interquartile range (IQR): 571.5

Descriptive statistics

Standard deviation: 621.7759951
Coefficient of variation (CV): 0.4259520577
Kurtosis: 26.4458555
Mean: 1459.732343
Median Absolute Deviation (MAD): 286
Skewness: 2.945035832
Sum: 6386329
Variance: 386605.3881
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1500                    46  1.1%
1200                    45  1.0%
1300                    45  1.0%
1600                    43  1.0%
1100                    38  0.9%
1700                    38  0.9%
1400                    35  0.8%
1350                    31  0.7%
950                     31  0.7%
800                     29  0.7%
Other values (1393)   3994  91.3%

Value                Count  Frequency (%)
105                      1  < 0.1%
125                      2  < 0.1%
175                      1  < 0.1%
200                      1  < 0.1%
225                      1  < 0.1%

Value                Count  Frequency (%)
11140                    1  < 0.1%
8800                     1  < 0.1%
6900                     1  < 0.1%
6802                     1  < 0.1%
6700                     1  < 0.1%

Status
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.2 KiB

Value                Count  Frequency (%)
1                     3159  72.2%
0                     1216  27.8%

Interactions

[81 interaction plots omitted; images not reproduced.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
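
As a minimal sketch of that definition (not pandas-profiling's own implementation), the function below computes r in plain Python. The name `pearson_r` is hypothetical; the example values are the Seniority and Age entries of the first five sample rows of this report. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the common 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


seniority = [9.0, 17.0, 10.0, 0.0, 0.0]
age = [30.0, 58.0, 46.0, 24.0, 26.0]

r = pearson_r(seniority, age)
# Shifting and rescaling one variable leaves r unchanged (location/scale invariance).
r_rescaled = pearson_r([2 * x + 3 for x in seniority], age)
print(r, r_rescaled)
```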

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
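
The sketch below follows that recipe in plain Python: rank both variables, then apply the Pearson formula to the ranks. The helper names are hypothetical, and assigning tied values the average of their positions is an assumption about tie handling that the text above does not specify.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations (1/n factors cancel)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: rho reaches 1 while r stays below 1.
xs = [1, 2, 3, 4]
ys = [1, 8, 27, 64]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```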

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
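
Taken literally, that description corresponds to the tau-a variant, sketched below in plain Python with a hypothetical function name. Note that this is an assumption about which variant is meant: some libraries (SciPy, for example) default to tau-b, which adjusts the denominator for ties, so the heatmap values may differ from this sketch when ties are present.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```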

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

[Two missing-value plots omitted; images not reproduced.]

Sample

First rows

Index  Seniority  Home  Time   Age  Marital  Records  Job  Expenses  Income   Assets    Debt  Amount   Price  Status
0            9.0   1.0  60.0  30.0      0.0      1.0  1.0      73.0   129.0      0.0     0.0   800.0   846.0       1
1           17.0   1.0  60.0  58.0      1.0      1.0  0.0      48.0   131.0      0.0     0.0  1000.0  1658.0       1
2           10.0   0.0  36.0  46.0      0.0      2.0  1.0      90.0   200.0   3000.0     0.0  2000.0  2985.0       0
3            0.0   1.0  60.0  24.0      1.0      1.0  0.0      63.0   182.0   2500.0     0.0   900.0  1325.0       1
4            0.0   1.0  36.0  26.0      1.0      1.0  0.0      46.0   107.0      0.0     0.0   310.0   910.0       1
5            1.0   0.0  60.0  36.0      0.0      1.0  0.0      75.0   214.0   3500.0     0.0   650.0  1645.0       1
6           29.0   0.0  60.0  44.0      0.0      1.0  0.0      75.0   125.0  10000.0     0.0  1600.0  1800.0       1
7            9.0   1.0  12.0  27.0      1.0      1.0  0.0      35.0    80.0      0.0     0.0   200.0  1093.0       1
8            0.0   0.0  60.0  32.0      0.0      1.0  1.0      90.0   107.0  15000.0     0.0  1200.0  1957.0       1
9            0.0   1.0  48.0  41.0      0.0      1.0  1.0      90.0    80.0      0.0     0.0  1200.0  1468.0       0

Last rows

Index  Seniority  Home  Time   Age  Marital  Records  Job  Expenses  Income   Assets    Debt  Amount   Price  Status
4365         1.0   0.0  60.0  31.0      1.0      2.0  0.0      35.0   242.0   9000.0     0.0  1500.0  1656.0       1
4366         6.0   1.0  60.0  22.0      1.0      2.0  0.0      35.0   100.0      0.0     0.0  1200.0  1496.0       0
4367         6.0   0.0  48.0  52.0      0.0      1.0  0.0      45.0   190.0   3500.0     0.0  1500.0  1905.0       1
4368         3.0   0.0  60.0  49.0      0.0      1.0  0.0      35.0   160.0   3000.0     0.0   900.0   975.0       1
4369         1.0   1.0  48.0  30.0      0.0      2.0  1.0      75.0    77.0      0.0     0.0  1200.0  1300.0       0
4370         1.0   1.0  60.0  39.0      0.0      1.0  0.0      69.0    92.0      0.0     0.0   900.0  1020.0       0
4371        22.0   0.0  60.0  46.0      0.0      1.0  0.0      60.0    75.0   3000.0   600.0   950.0  1263.0       1
4372         0.0   0.0  24.0  37.0      0.0      1.0  1.0      60.0    90.0   3500.0     0.0   500.0   963.0       0
4373         0.0   1.0  48.0  23.0      1.0      1.0  1.0      49.0   140.0      0.0     0.0   550.0   550.0       1
4374         5.0   0.0  60.0  32.0      0.0      1.0  1.0      60.0   140.0   4000.0  1000.0  1350.0  1650.0       1

Duplicate rows

Most frequent

Index  Seniority  Home  Time   Age  Marital  Records  Job  Expenses  Income   Assets    Debt  Amount   Price  Status  count
0            4.0   0.0  60.0  37.0      0.0      1.0  1.0      35.0   128.0  18000.0     0.0   800.0  1560.0       1      2
1            5.0   0.0  60.0  36.0      0.0      1.0  1.0      90.0    62.0   3000.0     0.0   650.0  1295.0       0      2